Abstract: A quantitative method is suggested in which the meanings of the 
words in a vocabulary, and the grammatical rules about them, are represented 
by real numbers. People meet randomly and average their vocabularies if they 
are equal in rank; otherwise they either copy from those higher in the 
hierarchy or stay idle. The presence of teachers broadcasting the same (but 
arbitrarily chosen) vocabulary makes the language formation converge more 
quickly. 

Introduction: Within the emerging physics literature on languages [1-12], 
the birth of a language appears to be a scarcely studied issue. In our 
opinion, the subject is important for research on language competition, 
since quickly developed languages may have a better chance to survive and 
to spread. In the present contribution we study the effect of inequality on 
the speed with which a language originates, where some social agents (people 
high in the hierarchy, teachers) play the role of nucleation centers for the 
clustering of words, meanings, grammatical rules, etc. We present a 
quantitative model in which each subentry of a vocabulary is represented by 
a real number, so that each word is a set of real numbers. The model is 
given in the following section; applications and results are presented in 
the next one. The last section is devoted to discussion and conclusion. 

Model: We have a society composed of N adults. Each person k has a 
vocabulary of M words ($w_{ki}$, $k \le N$, $i \le M$). For each word there 
exist many related items such as meanings, rules for plural forms, adverb 
forms, tenses, prefixes, suffixes, etc. In real life, many words have a 
variety of such peculiarities, which are not all easy to learn and to 
remember, since meanings may be close to each other, as with "dictionary" 
and "vocabulary". Pronunciations may be similar too, as with "a head" and 
"ahead", or "night", "knight", and "knife". Such variations are symbolized 
by at most five different subentries (j). So we take $1 \le j \le j_{\max}$ 
for every word w, and to each j we assign a representative real number r. 
Our words are therefore sets of up to five real numbers: 

$$w_{ki} = \{r_{kij}\}. \tag{1}$$

The maximum number $j_{\max}$ of subentries r is also determined randomly 
between 1 and 5, independently for each word w. Clearly, $r_{kij} = 0$ for 
all j (i.e., $w_{ki} = \{0, 0, 0, 0, 0\}$) corresponds to an unknown meaning 
(word) in the vocabulary of adult k. 

Initially there is no consensus about a common vocabulary, but consensus 
may be reached through several processes described in the following 
subsections, and the initial values of the $r_{kij}$ in Eq. (1) must be 
replaced by time-dependent ones, i.e. $r_{kij}(t)$. 

Evolution of the language spoken by any adult may be described by 

$$L_k(t) = \sum_{i=1}^{M} \sum_{j=1}^{j_{\max}} r_{kij}(t), \tag{2}$$

where $L_k$ varies from person to person, especially at the beginning of the 
formation period, and this fluctuation fades with time since $L_k \to L$ 
if convergence occurs. 

Eq. (2) may be summed over the members (k) of the society to consider 
all the vocabularies present at time t: 

$$D(t) = \sum_{k=1}^{N} L_k(t). \tag{3}$$

As we observed, $D(t) - D(t-1)$ is a significant quantity within the present 
formalism, and we denote it by V(t): 

$$V(t) = D(t) - D(t-1). \tag{4}$$

As $t \to \infty$, D(t) is expected to converge to its limit 
$D(t \to \infty)$, $L_k(t)$ to some L, and V(t) to zero. Then the language 
$L = L_k(t \to \infty)$ may be regarded as established. Minor fluctuations 
of D(t) about $D(t \to \infty)$, and of V(t) about zero, may be attributed 
to misuse arising from the limits of individual memory in remembering all 
the relevant meanings, rules, etc. 
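
The quantities of Eqs. (2)-(4) can be made concrete with a minimal Python sketch. It assumes a storage layout that the text does not specify: a nested list `vocab[k][i][j]` holding $r_{kij}$, with unused subentries set to zero. The function names are illustrative only.

```python
def language(vocab_k):
    """L_k of Eq. (2): sum of all subentries in one adult's vocabulary."""
    return sum(sum(word) for word in vocab_k)


def total(vocab):
    """D of Eq. (3): sum of L_k over all adults."""
    return sum(language(vocab_k) for vocab_k in vocab)


def speed(d_series):
    """V(t) = D(t) - D(t-1) of Eq. (4), for t = 1, 2, ..."""
    return [later - earlier for earlier, later in zip(d_series, d_series[1:])]


# Hand-made toy society (illustrative numbers only): N = 2 adults, M = 2 words.
example = [
    [[0.5, 0.25, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0, 0.0]],
    [[0.25, 0.0, 0.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0, 0.0]],
]
print(language(example[0]), language(example[1]), total(example))
print(speed([3.0, 2.5, 2.5]))
```

Because D(t) is a plain sum, any update that only redistributes values between two adults leaves it unchanged, which is why V(t) is the natural indicator of copying events.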

Initiation: We assign random real numbers as initial values of $r_{kij}$, 
with $0 \le r_{kij} < 1$, where $k \le N$, $i \le M$, and 
$j \le j_{\max}(k, i)$. 
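
A minimal sketch of this initiation step follows, in Python. The padding of unused subentries with zeros, the fixed width of five, and the names `make_word` and `make_society` are assumptions made for illustration; the text only states that $j_{\max}$ is drawn between 1 and 5 and that the subentries are random real numbers.

```python
import random

MAX_SUBENTRIES = 5  # upper bound on j_max stated in the text


def make_word(rng):
    """One word: j_max random subentries in [0, 1), padded with zeros (assumed)."""
    j_max = rng.randint(1, MAX_SUBENTRIES)  # inclusive on both ends
    used = [rng.random() for _ in range(j_max)]
    return used + [0.0] * (MAX_SUBENTRIES - j_max)


def make_society(n_adults, n_words, seed=0):
    """vocab[k][i][j] holds r_kij(0) for adult k, word i, subentry j."""
    rng = random.Random(seed)
    return [[make_word(rng) for _ in range(n_words)] for _ in range(n_adults)]


vocab = make_society(n_adults=4, n_words=3)
```

A fixed seed is used only so that repeated runs give the same initial society.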

Evolution: Once the initial vocabularies are set, we assume that two 
members (k and k') meet randomly at a time t. 

In the simplest case of no inequality in status (equality), they average 
[13-14] the subentries ($r_{kij}$) of their vocabularies and share the new 
ones: 

$$r_{kij}(t) = \frac{r_{kij}(t-1) + r_{k'ij}(t-1)}{2} = r_{k'ij}(t), \tag{5}$$

and the language spoken by each adult (k) becomes: 

$$L_k(t) = L_{k'}(t) = \frac{L_k(t-1) + L_{k'}(t-1)}{2}. \tag{6}$$

As the interaction tours (time t) advance, $r_{kij}(t \to \infty) = 1/2$ 
(the mean of the uniformly distributed initial values), independent of the 
subindices. We have $D(t) = D(0)$ and $V(t) = V(0) = 0$ for all t, since 
$L_k(t) + L_{k'}(t) = L_k(t-1) + L_{k'}(t-1)$ by Eq. (6). 
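
The conservation of D(t) under pairwise averaging can be checked numerically. The sketch below is an illustration under simplifying assumptions, not the simulation used for the figures: each vocabulary is flattened to a short list of numbers, every entry is in use, and the sizes and seed are arbitrary.

```python
import random


def average_pair(vocab, k, k2):
    """Eq. (5): both adults adopt the entry-wise mean of their vocabularies."""
    mean = [(a + b) / 2 for a, b in zip(vocab[k], vocab[k2])]
    vocab[k] = mean
    vocab[k2] = list(mean)


def run_equal_society(n_adults=50, n_entries=4, n_meetings=5000, seed=1):
    rng = random.Random(seed)
    vocab = [[rng.random() for _ in range(n_entries)] for _ in range(n_adults)]
    d_start = sum(map(sum, vocab))  # D(0)
    for _ in range(n_meetings):
        k, k2 = rng.sample(range(n_adults), 2)  # two distinct adults meet
        average_pair(vocab, k, k2)
    d_end = sum(map(sum, vocab))  # D after all meetings
    # Spread of the first entry across adults: shrinks as consensus forms.
    spread = max(v[0] for v in vocab) - min(v[0] for v in vocab)
    return d_start, d_end, spread


d_start, d_end, spread = run_equal_society()
print(abs(d_end - d_start), spread)
```

In such a run the total stays fixed up to floating-point rounding while the spread between adults collapses, so all adults end near the mean of the initial values.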

We incorporate inequality into the society by assigning each adult a rank, 
a real number (greater than or equal to zero, and less than one) determined 
randomly. Any two adults are nevertheless considered equivalent if their 
ranks differ by less than a given A, and each then averages her vocabulary 
with the other, Eqs. (5) and (6). Otherwise, the one with lower rank 
(obeying) copies the vocabulary of the other (commander) and takes it as her 
new vocabulary until her next meeting with any adult. In this case 
convergence (formation) of the language may be sped up under certain 
conditions, as studied in the following section. 
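
The meeting rule just described can be sketched as a single Python function. The name `meet`, the flattened vocabulary lists, and the parameter name `threshold` (standing for A) are illustrative assumptions.

```python
def meet(vocab, rank, k, k2, threshold):
    """One meeting: average if the ranks are within `threshold`, else copy."""
    if abs(rank[k] - rank[k2]) < threshold:
        # Equivalent adults: both take the entry-wise mean, Eq. (5).
        mean = [(a + b) / 2 for a, b in zip(vocab[k], vocab[k2])]
        vocab[k], vocab[k2] = mean, list(mean)
    elif rank[k] < rank[k2]:
        vocab[k] = list(vocab[k2])  # k obeys and copies k2
    else:
        vocab[k2] = list(vocab[k])  # k2 obeys and copies k


rank = [0.1, 0.9]  # the ranks differ by 0.8

far = [[0.0, 0.5], [1.0, 0.25]]
meet(far, rank, 0, 1, threshold=0.2)   # 0.8 >= 0.2: adult 0 copies adult 1

near = [[0.0, 0.5], [1.0, 0.25]]
meet(near, rank, 0, 1, threshold=0.9)  # 0.8 < 0.9: both average
```

Copying changes the total D(t) whereas averaging does not, so with this rule V(t) is nonzero only in tours where copying takes place.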

Furthermore, we may assume a more stringent inequality: some members of the 
hierarchy (all with rank unity) broadcast the same (yet arbitrarily 
selected) vocabulary to the society from the beginning on. We call them 
teachers. They do not change their common vocabulary and, owing to their 
ultimate rank, they do not average their vocabularies with anyone. Other 
high-ranking people (within the given limit A of the teachers' rank) may 
average their vocabularies after they discuss with teachers, and the rest 
copy from all those whose rank is higher by more than A. 
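
A hedged sketch of how teachers might enter such a simulation is given below. It assumes that a teacher is simply an adult with rank 1 whose vocabulary is never overwritten, that one tour consists of as many meetings as there are adults, and that vocabularies are flattened lists; all names, sizes, and the seed are illustrative and not taken from the text.

```python
import random


def meet(vocab, rank, fixed, k, k2, threshold):
    """One meeting; adults flagged in `fixed` (teachers) never change."""
    if abs(rank[k] - rank[k2]) < threshold:
        mean = [(a + b) / 2 for a, b in zip(vocab[k], vocab[k2])]
        new_k, new_k2 = mean, list(mean)
    elif rank[k] < rank[k2]:
        new_k, new_k2 = list(vocab[k2]), vocab[k2]
    else:
        new_k, new_k2 = vocab[k], list(vocab[k])
    if not fixed[k]:
        vocab[k] = new_k
    if not fixed[k2]:
        vocab[k2] = new_k2


def run(n_adults=50, n_teachers=10, n_entries=3, threshold=0.2,
        n_tours=200, seed=2):
    rng = random.Random(seed)
    lesson = [rng.random() for _ in range(n_entries)]  # broadcast vocabulary
    fixed = [k < n_teachers for k in range(n_adults)]
    rank = [1.0 if is_teacher else rng.random() for is_teacher in fixed]
    vocab = [list(lesson) if is_teacher
             else [rng.random() for _ in range(n_entries)]
             for is_teacher in fixed]
    for _ in range(n_tours * n_adults):  # one tour = n_adults meetings (assumed)
        k, k2 = rng.sample(range(n_adults), 2)
        meet(vocab, rank, fixed, k, k2, threshold)
    # Largest remaining distance of any entry from the teachers' vocabulary.
    gap = max(abs(a - b) for v in vocab for a, b in zip(v, lesson))
    return lesson, vocab, gap


lesson, vocab, gap = run()
print(gap)
```

In this sketch an adult whose rank lies within the threshold of unity moves halfway toward the teachers' vocabulary at each meeting with a teacher, while everyone else adopts it outright, so the teachers act as fixed nucleation centers.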

Applications and Results: In this section we first consider uniformity 
(equality) within society. Later, by assigning to each individual a random 
real number (rank; greater than or equal to zero, and less than one), we 
establish a hierarchy. Finally, we incorporate some teachers with the 
ultimate rank of unity into the society. 

We handled equality among adults by assuming an averaging process for the 
words ($w_{ki}$ of Eqs. (1), (2), and (5)) and the meanings ($r_{kij}$ of 
Eq. (1)) [13-14]. The evolution of $r_{kij}(t)$ for a randomly selected j is 
displayed in Figure 1, where the adults (N = 500) are all equal and only a 
hundred arbitrarily chosen adults are displayed. Each adult had her own 
randomly selected initial meanings ($r_{kij}(0)$), as used by herself and 
suggested to the society. Whenever any two of the adults randomly meet, they 
obey Eq. (5); each interacts equally with the other and averages her 
vocabulary. $D(t) = D(0)$ and $V(t) = V(0) = 0$ for all t, since D(t) of 
Eq. (3) does not change during interactions 
($L_k(t) + L_{k'}(t) = L_k(t-1) + L_{k'}(t-1)$, Eq. (6)). The corresponding 
probability density function (PDF) for the time rate of change of 
$r_{kij}(t)$ (with N = 500, M = 100 and $j \le j_{\max}$) is a delta 
function at zero, i.e., $\mathrm{PDF}(V) = \delta(V)$ (inset, Fig. 1). 

We then incorporate inequality into the society by assigning each adult a 
randomly determined rank (a real number greater than or equal to zero, and 
less than one). Any two adults are considered equivalent if their ranks 
differ by less than a given A; in that case each averages her vocabulary 
with the other, Eqs. (5) and (6). Otherwise, the one with lower rank copies 
the vocabulary of the other and takes it as her new vocabulary until her 
next meeting with any adult. 

For small A, almost everybody (except the top of the hierarchy, with rank 
above $1 - A$) may copy from others, and almost everybody (except the bottom 
of the hierarchy, with rank below A) may be copied by others. In this 
regime, the averaging process between equals is negligible within the 
society (N). On the other hand, if 0.5 < A, only the top of the hierarchy 
(rank above A) will be copied, and only by the bottom (rank below $1 - A$), 
and more than half of the society will average. Clearly, the averaging 
process dominates as A increases from 0.5 toward 1.0; this regime therefore 
implies more freedom and more discussion. The case A = 1.0 corresponds to 
equality of all the adults. 
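
The balance between averaging and copying can be quantified under one extra assumption that is not stated in the text: ranks are independent and uniform on [0, 1), and no teachers are present. Two such ranks differ by less than A with probability $1 - (1 - A)^2$, which equals 0.75 at A = 0.5 and 0.36 at A = 0.2. The Python sketch below compares this expression with a Monte Carlo estimate; the function names are illustrative.

```python
import random


def averaging_probability(threshold):
    """P(|x - y| < threshold) for independent uniform ranks x, y in [0, 1)."""
    return 1.0 - (1.0 - threshold) ** 2


def estimate(threshold, n_pairs=200_000, seed=3):
    """Monte Carlo estimate of the same probability from random rank pairs."""
    rng = random.Random(seed)
    hits = sum(abs(rng.random() - rng.random()) < threshold
               for _ in range(n_pairs))
    return hits / n_pairs


for a in (0.2, 0.5, 0.9):
    print(a, averaging_probability(a), estimate(a))
```

Read as a statement about meetings rather than about individuals, this supports the remark that averaging takes over once A exceeds 0.5.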

The evolutions of $r_{kij}(t)$ (with various N and M as designated in the 
figure captions, and $j \le j_{\max}$), of D(t), and of V(t) are displayed 
in Figures 2a, 2b, and 2c, respectively, where A = 0.2 for all. One may 
remark that the discussing and averaging mechanism between (close) equals 
(within A), or copying from the vocabulary of some higher-ranked person, 
causes the language to converge; yet convergence is very slow for 
$A \approx 0$ and speeds up as $A \to 1$. For $A \approx 0$, the whole 
society ultimately speaks the language of the one with the highest rank, 
which is very close to unity. 
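
The limiting case of vanishing A can be illustrated with a short Python sketch in which averaging never occurs and every meeting ends in copying. The society size, number of meetings, seed, and flattened vocabularies are illustrative choices, not values from the text.

```python
import random


def run_pure_copying(n_adults=20, n_entries=3, n_meetings=20_000, seed=4):
    rng = random.Random(seed)
    rank = [rng.random() for _ in range(n_adults)]
    vocab = [[rng.random() for _ in range(n_entries)] for _ in range(n_adults)]
    top = max(range(n_adults), key=rank.__getitem__)  # highest-ranked adult
    top_vocab = list(vocab[top])
    for _ in range(n_meetings):
        k, k2 = rng.sample(range(n_adults), 2)
        # With zero tolerance nobody is "equal": the lower rank always copies.
        if rank[k] < rank[k2]:
            vocab[k] = list(vocab[k2])
        else:
            vocab[k2] = list(vocab[k])
    return top_vocab, vocab


top_vocab, vocab = run_pure_copying()
print(all(v == top_vocab for v in vocab))
```

The top-ranked adult never copies anyone, so her vocabulary is the only fixed point of this process. It spreads downward one rank at a time, which is consistent with the slow convergence noted for small A.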

Teachers: Figure 3a displays the evolution of $r_{kij}(t)$, with a randomly 
selected number of meanings ($j \le j_{\max}$), where the meanings of words 
belong to the languages of a hundred arbitrarily chosen adults out of 
N = 5000 adults. Please note the horizontal limiting line representing the 
language broadcast by the teachers. The greater the distance from this line, 
the greater the effort needed to learn the language. Figure 3b displays 
D(t) and Figure 3c displays V(t), with A = 0.2 and r = 0.2 in all, where r 
designates the number of teachers per population of the society (N). Please 
note the rapid convergence in D(t) and V(t). 

In Figure 3c there exist three behaviors in V(t): for the $t \approx 0$ 
region we have comparatively big fluctuations; for $t \to \infty$ we have 
very small fluctuations, both about zero; and in between we have exponential 
decay. The initial fluctuations originate from randomness, and the number of 
equilibrated ones may be increased by increasing the number of tours (and 
also the precision of real numbers in the utilized software). So the 
characteristic regime is the intermediate one, and the exponential decay 
implies that the envelope function for D(t) (which passes through local 
maxima and minima) is also an exponentially decaying one. (We had observed 
similar exponential decays within our computations on opinion dynamics 
[16].) The pronounced threefold behavior is reflected in the PDFs in Figure 
3c, where the horizontal axis is for $V^2$ and the perpendicular one is 
logarithmic. 
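
One way to extract a decay rate from the intermediate regime is a straight-line fit to log|V(t)| against t. The Python sketch below shows the idea on synthetic numbers that stand in for V(t); they are not data from the figures, and the rate 0.05, the noise range, and the function name `fit_decay_rate` are arbitrary illustrative choices.

```python
import math
import random


def fit_decay_rate(times, values):
    """Least-squares slope of log|values| against times, returned as a rate."""
    logs = [math.log(abs(v)) for v in values]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx  # positive for a decaying signal


# Synthetic stand-in for the intermediate regime of V(t); NOT data from the paper.
rng = random.Random(5)
true_rate = 0.05
times = list(range(100))
speeds = [-math.exp(-true_rate * t) * rng.uniform(0.5, 1.5) for t in times]

rate = fit_decay_rate(times, speeds)
print(rate)
```

Since $D(t) - D(t \to \infty)$ equals minus the sum of all later V values, a V(t) that decays exponentially at some rate gives an envelope of D(t) that decays at the same rate.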

Small-speed regimes in the PDFs of Figure 3d correspond to the 
$t \to \infty$ region in V(t), which may be ignored totally. Please note 
that PDF(V) (and PDF($V^2$)) goes to a delta function as A approaches unity, 
and the high-speed wing tips in the PDFs come from the $t \approx 0$ region 
in V(t), where randomness is dominant. Teachers shape the intermediate 
region, and due to them we have the exponential convergence in D(t). One new 
language then emerges, which is spoken by the majority of adults and will be 
learned by children. 

Discussion and Conclusion: Clearly, increasing the number of teachers (and 
r) increases the rates of exponential decay in V(t) and D(t): there will be 
more chances to check personal vocabularies, and the number of ordinary 
adults will be lowered. Big differences between the real numbers associated 
with entries of the broadcast common vocabulary and those of the initial 
settings may be considered a kind of measure of the difficulty of learning 
the relevant language, since more interaction tours will be needed for 
averaging before the personal vocabularies approach the broadcast one. If 
the equilibrium level of D(t) is far from the initial one, then the emerging 
language may be considered a tough one to learn. (We ran the case with 
$0.9 < r_{kij} < 1.0$, and $0.0 < r_{kij} < 0.1$ (Eq. 3) for the teachers' 
vocabulary, many times and verified the last remark in all.) 

We also ran the case where each teacher broadcast (keeping her ultimate 
rank) a different vocabulary, rather than a common one. This case 
corresponds to a richer language, and we still obtained exponential decays, 
though rather slower ones. In our opinion, this result agrees well with the 
reality that languages involving more words and grammatical rules are harder 
to learn than those with fewer words and rules. In any case, the presence of 
nuclei speeds up the clustering of words and rules, and the relevant 
language emerges quickly. So, when a group of people immigrate to a new 
society, and if they gain rank (power), they may broadcast their language to 
the present society, which may be considered one of the possible mechanisms 
for spreading languages besides colonization, conquest, etc. As a final 
remark, we varied the number of words (the upper limit within the sum of 
Eq. (2)) and the number of adults (N) within the society from 10 and 100 up 
to 1000 and 5000, respectively, and obtained similar results. As the numbers 
decreased, fluctuations increased; yet the envelope of D(t) always came out 
as exponential. 

FIGURES 

Figure 1 Evolution of $\sum_{i}^{M} r_{kij}(t)$ for three adults, where j is 
arbitrary with $j \le j_{\max}$, and M = 100, for N = 500. Inset shows the 
PDF for the time rate of change of $r_{kij}(t)$. 

Figure 2a Evolution of $\sum_{i}^{M} r_{kij}(t)$ for three adults, where j 
is arbitrary and M = 100, for N = 1000, where A = 0.2. 

Figure 2b Evolution of D(t) with M = 300, N = 5000, for A = 0.2. 

Figure 2c Evolution of V(t) with M = 300, N = 5000, for A = 0.2. 

Figure 3a Evolution of $r_{kij}(t)$ for three adults, where j is arbitrary 
and M = 300, for N = 1000, with A = 0.2, r = 0.2. 

Figure 3b D(t) with A = 0.2 and r = 0.2. Please note the rapid convergence. 

Figure 3c V(t) with A = 0.2 and r = 0.2. Please note the rapid convergence 
in V(t). The perpendicular axis for V(t) is logarithmic. The inset shows the 
PDF for the given V(t). 

Figure 3d PDF(V) for A = 0.2 and various r, where the horizontal axis is 
$V^2$ and the perpendicular one is logarithmic. 